Almost-everywhere Algorithmic Stability and Generalization Error

Authors

  • Samuel Kutin
  • Partha Niyogi
Abstract

We introduce a new notion of algorithmic stability, which we call training stability. We show that training stability is sufficient for good bounds on generalization error. These bounds hold even when the learner has infinite VC dimension. In the PAC setting, training stability gives necessary and sufficient conditions for exponential convergence, and thus serves as a distribution-dependent analog to VC dimension. Our proof generalizes an argument of Bousquet and Elisseeff (2001), who show that the more rigid assumption of uniform hypothesis stability implies good bounds on generalization error. We argue that weaker forms of hypothesis stability also give good bounds. We explore the relationships among VC dimension, generalization error, and different notions of stability.
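For orientation, here is a sketch of the uniform hypothesis stability notion that the abstract calls "more rigid," in the style of Bousquet and Elisseeff; the symbols below (the stability constant β, the loss ℓ, the trained hypothesis A_S, the leave-one-out set S^{\setminus i}, and the loss bound M) are standard notation assumed for illustration, not drawn from this page. An algorithm A has uniform hypothesis stability β if, for every training set S of m examples and every index i,

\[
\sup_{z} \bigl| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \bigr| \;\le\; \beta .
\]

One commonly cited consequence, stated here purely as an illustration, is that with probability at least 1 - δ over the draw of S, the true risk R and empirical risk R_emp satisfy

\[
R(A_S) \;\le\; R_{\mathrm{emp}}(A_S) + 2\beta + (4 m \beta + M)\sqrt{\frac{\ln(1/\delta)}{2m}} ,
\]

which is meaningful only when β decays faster than 1/\sqrt{m}. Training stability, as the title suggests, asks for closeness of this kind only "almost everywhere" (with high probability over the data) rather than uniformly over all training sets.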

Related articles

The interaction of stability and weakness in AdaBoost

We provide an analysis of AdaBoost within the framework of algorithmic stability. In particular, we show that AdaBoost is a stability-preserving operation: if the “input” (the weak learner) to AdaBoost is stable, then the “output” (the strong learner) is almost-everywhere stable. Because classifier combination schemes such as AdaBoost have greatest effect when the weak learner is weak, we discuss...
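For readers unfamiliar with the "input weak learner / output strong learner" framing, the following is a minimal, generic AdaBoost sketch in Python (binary labels in {-1, +1}, with a depth-1 decision tree assumed as the weak learner); it illustrates the loop being analyzed, not the paper's stability argument.

# Generic AdaBoost sketch (illustrative only; not the paper's construction).
# Assumes labels y in {-1, +1}; the weak learner is a depth-1 decision tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=50):
    X, y = np.asarray(X), np.asarray(y, dtype=float)
    n = len(y)
    w = np.full(n, 1.0 / n)                         # distribution over training examples
    hypotheses, alphas = [], []
    for _ in range(n_rounds):
        weak = DecisionTreeClassifier(max_depth=1)  # the "input": a weak learner
        weak.fit(X, y, sample_weight=w)
        pred = weak.predict(X)
        err = float(np.sum(w[pred != y]))           # weighted training error
        err = min(max(err, 1e-10), 1 - 1e-10)       # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)       # vote weight for this round
        w *= np.exp(-alpha * y * pred)              # up-weight the examples it got wrong
        w /= w.sum()
        hypotheses.append(weak)
        alphas.append(alpha)

    def strong(X_new):                              # the "output": weighted majority vote
        votes = sum(a * h.predict(np.asarray(X_new)) for a, h in zip(alphas, hypotheses))
        return np.sign(votes)
    return strong

Roughly, the abstract's claim is about this loop: if the call that fits the weak learner is a stable algorithm, then the returned weighted-vote classifier is almost-everywhere stable.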

Algorithmic Stability and Learning on Manifolds

The talk consists of two parts: in the first part, we review the notion of algorithmic stability to obtain bounds on generalization error using training error estimates. We introduce the new notion of training stability that is sufficient for tight concentration bounds in general and is both necessary and sufficient for PAC learning. In the second part, we consider several algorithms for which ...

Applications of Empirical Processes in Learning Theory: Algorithmic Stability and Generalization Bounds

This thesis studies two key properties of learning algorithms: their generalization ability and their stability with respect to perturbations. To analyze these properties, we focus on concentration inequalities and tools from empirical process theory. We obtain theoretical results and demonstrate their applications to machine learning. First, we show how various notions of stability upper- and lo...

Algorithmic Stability and Ensemble-Based Learning (Ph.D. dissertation, Department of Computer Science, Division of the Physical Sciences, The University of Chicago, by Samuel Kutin)

We explore two themes in formal learning theory. We begin with a detailed, general study of the relationship between the generalization error and stability of learning algorithms. We then examine ensemble-based learning from the points of view of stability, decorrelation, and threshold complexity. A central problem of learning theory is bounding generalization error. Most such bounds have been ...

Data-Dependent Stability of Stochastic Gradient Descent

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non...
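As a rough, hedged illustration of the object such results bound (the generic SGD stability setup, not this paper's data-dependent definition): run SGD with step sizes η_t on two training sets S and S' that differ in a single example, and compare the losses of the two final iterates,

\[
w_{t+1} = w_t - \eta_t \,\nabla_w \ell(w_t, z_{i_t}), \qquad
\sup_{z}\; \mathbb{E}\bigl[\,\bigl|\ell(w_T^{S}, z) - \ell(w_T^{S'}, z)\bigr|\,\bigr] \;\le\; \varepsilon_{\mathrm{stab}},
\]

where the expectation is over SGD's random choice of indices i_t, and a small ε_stab translates into a generalization bound of the same order. The contribution described in the abstract is to let ε_stab depend on the data-generating distribution rather than on worst-case problem constants.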

Publication date: 2002